Add a pull-style streaming select API #471
iskakaushik wants to merge 1 commit into ClickHouse:master
pg_clickhouse needs to consume select results one block at a time, but clickhouse-cpp only exposes a callback-driven select path today. That forces downstream users to layer coroutines or connection resets on top of the client when they need pull-style iteration.
Add BeginSelect(), ReceiveSelectBlock(), and EndSelect() to mirror the existing multi-step insert workflow. The implementation reuses the existing query and packet handling code, keeps Query callbacks active for progress, profile, and log packets, and drains canceled queries so connections remain reusable.
Add integration tests that cover full streaming iteration, preserved Query callbacks, early cleanup, end-of-stream reuse, and exception cleanup with subsequent reuse.
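To make the proposed call sequence concrete, here is a minimal sketch of pull-style iteration against an in-memory stand-in. Only the three method names come from the description above; `MockClient`, the stand-in `Block` type, `Idle()`, `CountValuesUpTo`, and all signatures and return types are assumptions for illustration, not the actual clickhouse-cpp API.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a result block; the real library has its own Block type.
struct Block {
    std::vector<int> values;
};

// Hypothetical in-memory client. Only the three method names come from the
// PR description; signatures and behaviour are assumptions.
class MockClient {
public:
    explicit MockClient(std::vector<Block> blocks) : blocks_(std::move(blocks)) {}

    void BeginSelect(const std::string& /*query*/) {
        assert(!selecting_ && "a select is already in progress");
        selecting_ = true;
        next_ = 0;
    }

    // Returns the next block, or std::nullopt at end of stream.
    std::optional<Block> ReceiveSelectBlock() {
        assert(selecting_ && "no select in progress");
        if (next_ == blocks_.size()) return std::nullopt;
        return blocks_[next_++];
    }

    // Ends the select. In this mock, unread blocks are simply skipped; the PR
    // describes draining the canceled query so the connection stays reusable.
    void EndSelect() {
        next_ = blocks_.size();
        selecting_ = false;
    }

    bool Idle() const { return !selecting_; }

private:
    std::vector<Block> blocks_;
    std::size_t next_ = 0;
    bool selecting_ = false;
};

// Pull blocks one at a time and stop early once `limit` values were seen.
std::size_t CountValuesUpTo(MockClient& client, std::size_t limit) {
    client.BeginSelect("SELECT 1");  // the query text is ignored by the mock
    std::size_t seen = 0;
    while (seen < limit) {
        std::optional<Block> block = client.ReceiveSelectBlock();
        if (!block) break;  // end of stream
        seen += block->values.size();
    }
    client.EndSelect();  // safe both at end of stream and after stopping early
    return seen;
}
```

The point of the sketch is that the caller owns the loop: it can stop after any block and still leave the client in a reusable state, which is what the callback-driven path makes awkward.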
@iskakaushik master seems to build fine; can you take a look at the CI errors?
slabko left a comment
After some back and forth, implementing a ReceivePacket that only parses a single packet and returns a tagged union (std::variant) does not seem particularly complex. The current ReceivePacket can then be built on top of it.
This gives full control over the ExecuteQuery loop, allowing a synchronous implementation without workarounds. In other words, the library can become synchronous while still preserving the asynchronous (callback-based) API.
This also enables synchronous handling of other events.
Finally, clickhouse::Query callbacks should not be ignored in synchronous mode—they can still be invoked, making the implementation complete.
There will be some extra work to do, but what exactly will become clear after experimenting more with a synchronous version of ReceivePacket.
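A rough sketch of the layering suggested in this comment, assuming simplified packet types: a synchronous ReceivePacket returns exactly one packet as a std::variant, and the callback-style loop is built on top of it. `PacketSource`, `Callbacks`, `RunWithCallbacks`, and the packet structs are hypothetical names for illustration, not clickhouse-cpp types.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Simplified, hypothetical packet payloads; the real protocol has more kinds.
struct DataPacket { std::vector<int> rows; };
struct ProgressPacket { std::size_t rows_read; };
struct ExceptionPacket { std::string message; };
struct EndOfStreamPacket {};

using Packet = std::variant<DataPacket, ProgressPacket, ExceptionPacket, EndOfStreamPacket>;

// Stand-in for the connection: packets are already parsed and queued in memory.
class PacketSource {
public:
    explicit PacketSource(std::deque<Packet> packets) : packets_(std::move(packets)) {}

    // Synchronous primitive: hand back exactly one packet per call.
    Packet ReceivePacket() {
        if (packets_.empty()) return EndOfStreamPacket{};
        Packet packet = std::move(packets_.front());
        packets_.pop_front();
        return packet;
    }

private:
    std::deque<Packet> packets_;
};

struct Callbacks {
    std::function<void(const DataPacket&)> on_data;
    std::function<void(const ProgressPacket&)> on_progress;
};

// Callback-style loop layered on the synchronous primitive.
// Returns true on a clean end of stream, false if an exception packet arrived.
bool RunWithCallbacks(PacketSource& source, const Callbacks& cb) {
    for (;;) {
        Packet packet = source.ReceivePacket();
        if (auto* data = std::get_if<DataPacket>(&packet)) {
            if (cb.on_data) cb.on_data(*data);
        } else if (auto* progress = std::get_if<ProgressPacket>(&packet)) {
            if (cb.on_progress) cb.on_progress(*progress);
        } else if (std::holds_alternative<ExceptionPacket>(packet)) {
            return false;
        } else {
            assert(std::holds_alternative<EndOfStreamPacket>(packet));
            return true;
        }
    }
}
```

A pull-style consumer would call `ReceivePacket()` directly and decide after each packet whether to continue, while callers of the callback API keep their current behaviour through a loop like `RunWithCallbacks`.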
- bool inserting_;
+ bool inserting_ = false;
+ bool selecting_ = false;
+ bool discarding_select_data_ = false;
I believe this will not be needed once a proper synchronous version of ReceivePacket is implemented.
  ServerInfo server_info_;
- bool inserting_;
+ bool inserting_ = false;
A bunch of mutually exclusive bools looks like a good candidate for an enum.
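As a sketch of that suggestion, the three flags from the diff could collapse into a single state value. The enum and class names below are hypothetical, and the sketch assumes the flags really are mutually exclusive, which the diff alone does not confirm.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical replacement for inserting_, selecting_ and
// discarding_select_data_; exactly one state holds at any time.
enum class ClientState {
    Idle,
    Inserting,
    Selecting,
    DiscardingSelectData,
};

class StateTracker {
public:
    ClientState state() const { return state_; }

    void BeginInsert() { Enter(ClientState::Inserting); }
    void BeginSelect() { Enter(ClientState::Selecting); }

    // Cancel a select early: remaining data is discarded before going idle.
    void CancelSelect() {
        assert(state_ == ClientState::Selecting && "no select in progress");
        state_ = ClientState::DiscardingSelectData;
    }

    // The current operation has finished (or has been fully drained).
    void Finish() { state_ = ClientState::Idle; }

private:
    // A new operation may only start from Idle.
    void Enter(ClientState next) {
        if (state_ != ClientState::Idle) {
            throw std::logic_error("another operation is in progress");
        }
        state_ = next;
    }

    ClientState state_ = ClientState::Idle;
};
```

With a single enum, combinations such as inserting and selecting at once cannot be represented, so the check that guards every new operation reduces to one comparison against `Idle`.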
@iskakaushik Regarding API naming, for a pull-based API I would use Based on that it is relatively easy to return data to the caller directly and allow interactive control over the loop. This is where I would start.